Abstract


We consider a variant of the Boolean satisfiability problem where a subset $\mathcal{E}$ of the propositional
variables appearing in formula $F_{sat}$ encodes a symmetric, transitive, binary relation over $N$ elements.
Each of these relational variables, $e_{i,j}$, for $1 \le i < j \le N$, expresses whether or not the
relation holds between elements $i$ and $j$. The task is to either find a satisfying assignment to $F_{sat}$
that also satisfies all transitivity constraints over the relational variables (e.g., $e_{1,2} \land e_{2,3} \Rightarrow e_{1,3}$),
or to prove that no such assignment exists. Solving this satisfiability problem is the final and most
difficult step in our decision procedure for a logic of equality with uninterpreted functions. This
procedure forms the core of our tool for verifying pipelined microprocessors.

To  use  a  conventional  Boolean  satisfiability  checker,  we  augment  the  set  of  clauses  expressing  Fsat 
with  clauses  expressing  the  transitivity  constraints.  We  consider  methods  to  reduce  the  number  of 
such  clauses  based  on  the  sparse  structure  of  the  relational  variables. 

To use Ordered Binary Decision Diagrams (OBDDs), we show that for some sets $\mathcal{E}$, the OBDD
representation of the transitivity constraints has exponential size for all possible variable orderings.
By  considering  only  those  relational  variables  that  occur  in  the  OBDD  representation  of  Fsat,  our 
experiments  show  that  we  can  readily  construct  an  OBDD  representation  of  the  relevant  transitivity 
constraints  and  thus  solve  the  constrained  satisfiability  problem. 


1  Introduction 


Consider the following variant of the Boolean satisfiability problem. We are given a Boolean
formula $F_{sat}$ over a set of variables $V$. A subset $\mathcal{E} \subseteq V$ symbolically encodes a binary relation
over $N$ elements that is reflexive, symmetric, and transitive. Each of these relational variables,
$e_{i,j}$, where $1 \le i < j \le N$, expresses whether or not the relation holds between elements $i$ and
$j$. Typically, $\mathcal{E}$ will be "sparse," containing far fewer than the $N(N-1)/2$ possible variables.
Note that when $e_{i,j} \notin \mathcal{E}$ for some value of $i$ and of $j$, this does not imply that the relation does
not hold between elements $i$ and $j$. It simply indicates that $F_{sat}$ does not directly depend on the
relation between elements $i$ and $j$.

A  transitivity  constraint  is  a  formula  of  the  form 

$$e_{[i_1,i_2]} \land e_{[i_2,i_3]} \land \cdots \land e_{[i_{k-1},i_k]} \Rightarrow e_{[i_1,i_k]} \qquad (1)$$

where $e_{[i,j]}$ equals $e_{i,j}$ when $i < j$ and equals $e_{j,i}$ when $i > j$. Let $Trans(\mathcal{E})$ denote the set of
all transitivity constraints that can be formed from the relational variables. Our task is to find
an assignment $\chi\colon V \rightarrow \{0,1\}$ that satisfies $F_{sat}$, as well as every constraint in $Trans(\mathcal{E})$. Goel,
et  al.  [GSZAS98]  have  shown  this  problem  is  NP-hard,  even  when  Fsat  is  given  as  an  Ordered 
Binary  Decision  Diagram  (OBDD)  [Bry86].  Normally,  Boolean  satisfiability  is  trivial  given  an 
OBDD  representation  of  a  formula. 

We  are  motivated  to  solve  this  problem  as  part  of  a  tool  for  verifying  pipelined  microprocessors 
[VB99].  Our  tool  abstracts  the  operation  of  the  datapath  as  a  set  of  uninterpreted  functions  and 
uninterpreted  predicates  operating  on  symbolic  data.  We  prove  that  a  pipelined  processor  has 
behavior  matching  that  of  an  unpipelined  reference  model  using  the  symbolic  flushing  technique 
developed  by  Burch  and  Dill  [BD94].  The  major  computational  task  is  to  decide  the  validity 
of a formula $F_{ver}$ in a logic of equality with uninterpreted functions [BGV99a, BGV99b]. Our
decision procedure transforms $F_{ver}$ first by replacing all function application terms with terms
over a set of domain variables $\{v_i \mid 1 \le i \le N\}$. Similarly, all predicate applications are replaced
by formulas over a set of newly-generated propositional variables. The result is a formula $F^{*}_{ver}$
containing equations of the form $v_i = v_j$, where $1 \le i < j \le N$. Each of these equations is
then encoded by introducing a relational variable $e_{i,j}$, similar to the method proposed by Goel, et
al. [GSZAS98]. The result of the translation is a propositional formula $encf(F^{*}_{ver})$ expressing the
verification condition over both the relational variables and the propositional variables appearing
in $F^{*}_{ver}$. Let $F_{sat}$ denote $\lnot encf(F^{*}_{ver})$, the complement of the formula expressing the translated
verification condition. To capture the transitivity of equality, e.g., that $v_i = v_j \land v_j = v_k \Rightarrow v_i = v_k$,
we have transitivity constraints of the form $e_{[i,j]} \land e_{[j,k]} \Rightarrow e_{[i,k]}$. Finding a satisfying assignment
to $F_{sat}$ that also satisfies the transitivity constraints will give us a counterexample to the original
verification condition $F_{ver}$. On the other hand, if we can prove that there are no such assignments,
then we have proved that $F_{ver}$ is universally valid.

We consider three methods to generate a Boolean formula $F_{trans}$ that encodes the transitivity
constraints. The direct method enumerates the set of chord-free cycles in the undirected graph
having an edge $(i,j)$ for each relational variable $e_{i,j} \in \mathcal{E}$. This method avoids introducing
additional relational variables but can lead to a formula of exponential size. The dense method uses
relational variables $e_{i,j}$ for all possible values of $i$ and $j$ such that $1 \le i < j \le N$. We can then
axiomatize transitivity by forming constraints of the form $e_{[i,j]} \land e_{[j,k]} \Rightarrow e_{[i,k]}$ for all distinct values
of $i$, $j$, and $k$. This will yield a formula that is cubic in $N$. The sparse method augments $\mathcal{E}$ with
additional relational variables to form a set of variables $\mathcal{E}^{+}$, such that the resulting graph is chordal
[Rose70]. We then only require transitivity constraints of the form $e_{[i,j]} \land e_{[j,k]} \Rightarrow e_{[i,k]}$ such that
$e_{[i,j]}, e_{[j,k]}, e_{[i,k]} \in \mathcal{E}^{+}$. The sparse method is guaranteed to generate a smaller formula than the
dense method.

To use a conventional Boolean Satisfiability (SAT) procedure to solve our constrained satisfiability
problem, we run the checker over a set of clauses encoding both $F_{sat}$ and $F_{trans}$. The latest
version  of  the  FGRASP  SAT  checker  [M99]  was  able  to  complete  all  of  our  benchmarks,  although 
the  run  times  increase  significantly  when  transitivity  constraints  are  enforced. 

When using Ordered Binary Decision Diagrams to evaluate satisfiability, we could generate
OBDD representations of $F_{sat}$ and $F_{trans}$ and use the apply algorithm to compute an OBDD
representation of their conjunction. From this OBDD, finding satisfying solutions would be trivial.
We show that this approach will not be feasible in general, because the OBDD representation of
$F_{trans}$ can be intractable. That is, for some sets of relational variables, the OBDD representation
of the transitivity constraint formula $F_{trans}$ will be of exponential size regardless of the variable
ordering. The NP-completeness result of Goel, et al. shows that the OBDD representation of
$F_{trans}$ may be of exponential size using the ordering previously selected for representing $F_{sat}$
as an OBDD. This leaves open the possibility that there could be some other variable ordering
that would yield efficient OBDD representations of both $F_{sat}$ and $F_{trans}$. Our result shows that
transitivity constraints can be intrinsically intractable to represent with OBDDs, independent of
the structure of $F_{sat}$.

We  present  experimental  results  on  the  complexity  of  constructing  OBDDs  for  the  transitivity 
constraints  that  arise  in  actual  microprocessor  verification.  Our  results  show  that  the  OBDDs  can 
indeed  be  quite  large.  We  consider  two  techniques  to  avoid  constructing  the  OBDD  representation 
of  all  transitivity  constraints.  The  first  of  these,  proposed  by  Goel,  et  al.  [GSZAS98],  generates 
implicants  (cubes)  of  Fsat  and  rejects  those  that  violate  the  transitivity  constraints.  Although  this 
method  suffices  for  small  benchmarks,  we  find  that  the  number  of  implicants  generated  for  our 
larger  benchmarks  grows  unacceptably  large.  The  second  method  determines  which  relational 
variables  actually  occur  in  the  OBDD  representation  of  Fsat.  We  can  then  apply  one  of  our  three 
techniques  for  encoding  the  transitivity  constraints  in  order  to  generate  a  Boolean  formula  for  the 
transitivity  constraints  over  this  reduced  set  of  relational  variables.  The  OBDD  representation  of 
this  formula  is  generally  tractable,  even  for  the  larger  benchmarks. 


2  Benchmarks 

Our  benchmarks  [VB99]  are  based  on  applying  our  verifier  to  a  set  of  high-level  microprocessor 
designs.  Each  is  based  on  the  DLX  RISC  processor  described  by  Hennessy  and  Patterson  [HP96]: 

1xDLX-C: is a single-issue, five-stage pipeline capable of fetching up to one new instruction
every clock cycle. It implements six instruction types: register-register, register-immediate,
load, store, branch, and jump. The pipeline stages are: Fetch, Decode, Execute, Memory,
and Write-Back. An interlock causes the instruction following a load to stall one cycle if
it requires the loaded result. Branches and jumps are predicted as not-taken, with up to 3
instructions squashed when there is a misprediction. This example is comparable to the DLX
example first verified by Burch and Dill [BD94].

2xDLX-CA: has a complete first pipeline, capable of executing the six instruction types, and
a second pipeline capable of executing arithmetic instructions. Between 0 and 2 new
instructions are issued on each cycle, depending on their types and source registers, as well as
the types and destination registers of the preceding instructions. This example is comparable
to one verified by Burch [Bur96].

2xDLX-CC: has two complete pipelines, i.e., each can execute any of the six instruction types.
There are four load interlocks, between a load in Execute in either pipeline and an instruction
in Decode in either pipeline. On each cycle, between 0 and 2 instructions can be issued.

Circuit                  Domain     Propositional  Equations
                         Variables  Variables
1xDLX-C                  13         42             27
1xDLX-C-t                13         42             37
2xDLX-CA                 25         58             118
2xDLX-CA-t               25         58             137
2xDLX-CC                 25         70             124
2xDLX-CC-t               25         70             143
Buggy 2xDLX-CC   min.    22         56             89
                 avg.    25         69             124
                 max.    25         77             132

Table 1: Microprocessor Verification Benchmarks. Benchmarks with suffix "t" were modified
to require enforcing transitivity.

In all of these examples, the domain variables $v_i$, with $1 \le i \le N$, in $F_{ver}$ encode register
identifiers. As described in [BGV99a, BGV99b], we can encode the symbolic terms representing
program  data  and  addresses  as  distinct  values,  avoiding  the  need  to  have  equations  among  these 
variables.  Equations  arise  in  modeling  the  read  and  write  operations  of  the  register  file,  the  bypass 
logic  implementing  data  forwarding,  the  load  interlocks,  and  the  pipeline  issue  logic. 

Our original processor benchmarks can be verified without enforcing any transitivity constraints.
The unconstrained formula $F_{sat}$ is unsatisfiable in every case. We are nonetheless motivated
to study the problem of constrained satisfiability for two reasons. First, other processor
designs might rely on transitivity, e.g., due to more sophisticated issue logic. Second, to aid
designers in debugging their pipelines, it is essential that we generate counterexamples that satisfy
all transitivity constraints. Otherwise the designer will be unable to determine whether the
counterexample represents a true bug or a weakness of our verifier.

To create more challenging benchmarks, we generated variants of the circuits that require
enforcing transitivity in the verification. For example, the normal forwarding logic in the Execute
stage of 1xDLX-C must determine whether to forward the result from the Memory stage instruction
as either one or both operand(s) for the Execute stage instruction. It does this by comparing the
two source registers ESrc1 and ESrc2 of the instruction in the Execute stage to the destination
register MDest of the instruction in the Memory stage. In the modified circuit, we changed the
bypass condition ESrc1 = MDest to be ESrc1 = MDest $\lor$ (ESrc1 = ESrc2 $\land$ ESrc2 = MDest).
Given transitivity, these two expressions are equivalent. For each pipeline, we introduced four
such modifications to the forwarding logic, with different combinations of source and destination
registers. These modified circuits are named 1xDLX-C-t, 2xDLX-CA-t, and 2xDLX-CC-t.
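
The claim that the original and modified bypass conditions agree only under transitivity can be checked by brute force. The sketch below is illustrative and not part of the verifier: the function names are hypothetical, and each Boolean argument stands for one relational variable (the truth value of one register comparison).

```python
from itertools import product

def original_bypass(e_s1_md, e_s1_s2, e_s2_md):
    """Original condition: ESrc1 = MDest."""
    return e_s1_md

def modified_bypass(e_s1_md, e_s1_s2, e_s2_md):
    """Modified condition: ESrc1 = MDest or (ESrc1 = ESrc2 and ESrc2 = MDest)."""
    return e_s1_md or (e_s1_s2 and e_s2_md)

def is_transitive(e_s1_md, e_s1_s2, e_s2_md):
    """Three pairwise comparisons violate transitivity exactly when
    two of them hold and the third does not."""
    return (e_s1_md + e_s1_s2 + e_s2_md) != 2

def differing_assignments(require_transitivity):
    """Assignments on which the two bypass conditions disagree."""
    return [bits for bits in product([False, True], repeat=3)
            if (not require_transitivity or is_transitive(*bits))
            and original_bypass(*bits) != modified_bypass(*bits)]
```

Without the transitivity restriction the two conditions differ on a single assignment (ESrc1 = ESrc2 and ESrc2 = MDest hold, but ESrc1 = MDest does not); with the restriction they never differ, which is why the modified circuits can only be verified when transitivity is enforced.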

To study the problem of counterexample generation for buggy circuits, we generated 105 variants
of 2xDLX-CC, each containing a small modification to the control logic. Of these, 5 were
found to be functionally correct, e.g., because the modification caused the processor to stall
unnecessarily, yielding a total of 100 benchmark circuits for counterexample generation.

Table 1 gives some statistics for the benchmarks. The number of domain variables $N$ ranges
between 13 and 25, while the number of equations ranges between 27 and 143. The verification
condition formulas $F_{ver}$ also contain between 42 and 77 propositional variables expressing the
operation of the control logic. These variables plus the relational variables comprise the set of
variables $V$ in the propositional formula $F_{sat}$. The circuits with modifications that require
enforcing transitivity yield formulas containing up to 19 additional equations. The final three lines
summarize the complexity of the 100 buggy variants of 2xDLX-CC. We apply a number of
simplifications during the generation of formula $F_{sat}$, and hence small changes in the circuit can yield
significant variations in the formula complexity.


3  Graph  Formulation 

Our definition of $Trans(\mathcal{E})$ (Equation 1) places no restrictions on the length or form of the
transitivity constraints, and hence there can be an infinite number. We show that we can construct a
graph representation of the relational variables and identify a reduced set of transitivity constraints
that, when satisfied, guarantees that all possible transitivity constraints are satisfied. By introducing
more relational variables, we can alter this graph structure, further reducing the number of
transitivity constraints that must be considered.

For variable set $\mathcal{E}$, define the undirected graph $G(\mathcal{E})$ as containing a vertex $i$ for $1 \le i \le N$, and
an edge $(i,j)$ for each variable $e_{i,j} \in \mathcal{E}$. For an assignment $\chi$ of Boolean values to the relational
variables, define the labeled graph $G(\mathcal{E}, \chi)$ to be the graph $G(\mathcal{E})$ with each edge $(i,j)$ labeled as a
1-edge when $\chi(e_{i,j}) = 1$, and as a 0-edge when $\chi(e_{i,j}) = 0$.

A path is a sequence of vertices $[i_1, i_2, \ldots, i_k]$ having edges between successive elements.
That is, each element $i_p$ of the sequence ($1 \le p \le k$) denotes a vertex: $1 \le i_p \le N$, while each
successive pair of elements $i_p$ and $i_{p+1}$ ($1 \le p < k$) forms an edge $(i_p, i_{p+1})$. We consider each edge
$(i_p, i_{p+1})$ for $1 \le p < k$ to also be part of the path. A cycle is a path of the form $[i_1, i_2, \ldots, i_k, i_1]$.




Proposition 1 An assignment $\chi$ to the variables in $\mathcal{E}$ violates transitivity if and only if some cycle
in $G(\mathcal{E}, \chi)$ contains exactly one 0-edge.

Proof: If. Suppose there is such a cycle. Letting $i_1$ be the vertex at one end of the 0-edge, we
can trace around the cycle, giving a sequence of vertices $[i_1, i_2, \ldots, i_k]$, where $i_k$ is the vertex at
the other end of the 0-edge. The assignment has $\chi(e_{[i_j,i_{j+1}]}) = 1$ for $1 \le j < k$, and $\chi(e_{[i_1,i_k]}) = 0$,
and hence it violates Equation 1.

Only If. Suppose the assignment violates a transitivity constraint given by Equation 1. Then,
we construct a cycle $[i_1, i_2, \ldots, i_k, i_1]$ of vertices such that only edge $(i_k, i_1)$ is a 0-edge. □
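
Proposition 1 suggests a direct test for whether a given assignment violates transitivity: some cycle has exactly one 0-edge precisely when the endpoints of some 0-edge are joined by a path of 1-edges. The following sketch is illustrative only; it uses a union-find structure (an implementation choice made here, not taken from the text), and the function name and the dictionary representation of the labeled graph are assumptions.

```python
def violates_transitivity(n, labeled_edges):
    """Check an assignment against the criterion of Proposition 1.

    n             -- number of vertices, numbered 1..n
    labeled_edges -- dict mapping an edge (i, j) to its label, 0 or 1
    """
    parent = list(range(n + 1))

    def find(v):
        # Representative of v's component, with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # Merge the endpoints of every 1-edge.
    for (i, j), label in labeled_edges.items():
        if label == 1:
            parent[find(i)] = find(j)

    # A 0-edge inside one component closes a cycle with exactly one 0-edge.
    return any(label == 0 and find(i) == find(j)
               for (i, j), label in labeled_edges.items())
```

For example, a triangle with two 1-edges and one 0-edge is reported as a violation, while a triangle with two 0-edges is not.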

A path $[i_1, i_2, \ldots, i_k]$ is said to be acyclic when $i_p \ne i_q$ for all $1 \le p < q \le k$. A cycle
$[i_1, i_2, \ldots, i_k, i_1]$ is said to be simple when its prefix $[i_1, i_2, \ldots, i_k]$ is acyclic.

Proposition 2 An assignment $\chi$ to the variables in $\mathcal{E}$ violates transitivity if and only if some simple
cycle in $G(\mathcal{E}, \chi)$ contains exactly one 0-edge.

Proof:  The  “if”  portion  of  this  proof  is  covered  by  Proposition  1.  The  “only  if”  portion  is 
proved  by  induction  on  the  number  of  variables  in  the  antecedent  of  the  transitivity  constraint 
(Equation  1.)  That  is,  assume  a  transitivity  constraint  containing  k  variables  in  the  antecedent  is 
violated  and  that  all  other  violated  constraints  have  at  least  k  variables  in  their  antecedents.  If  there 
are no values $p$ and $q$ such that $1 \le p < q \le k$ and $i_p = i_q$, then the cycle $[i_1, i_2, \ldots, i_k, i_1]$ is simple.
If such values $p$ and $q$ exist, then we can form a transitivity constraint:

$$e_{[i_1,i_2]} \land \cdots \land e_{[i_{p-1},i_p]} \land e_{[i_q,i_{q+1}]} \land \cdots \land e_{[i_{k-1},i_k]} \Rightarrow e_{[i_1,i_k]}$$

This  transitivity  constraint  contains  fewer  than  k  variables  in  the  antecedent,  but  it  is  also  violated. 
This  contradicts  our  assumption  that  there  is  no  violated  transitivity  constraint  with  fewer  than  k 
variables  in  the  antecedent.  □ 

Define  a  chord  of  a  simple  cycle  to  be  an  edge  that  connects  two  vertices  that  are  not  adjacent 
in the cycle. More precisely, for a simple cycle $[i_1, i_2, \ldots, i_k, i_1]$, a chord is an edge $(i_p, i_q)$ in
$G(\mathcal{E})$ such that $1 \le p < q \le k$, with $p + 1 < q$, and either $p \ne 1$ or $q \ne k$. A cycle is said to be
chord-free  if  it  is  simple  and  has  no  chords. 

Proposition 3 An assignment $\chi$ to the variables in $\mathcal{E}$ violates transitivity if and only if some
chord-free cycle in $G(\mathcal{E}, \chi)$ contains exactly one 0-edge.

Proof: The "if" portion of this proof is covered by Proposition 1. The "only if" portion is
proved by induction on the number of variables in the antecedent of the transitivity constraint
(Equation 1). Assume a transitivity constraint with $k$ variables is violated, and that no transitivity
constraint with fewer variables in the antecedent is violated. If there are no values of $p$ and $q$ such
that there is a variable $e_{[i_p,i_q]} \in \mathcal{E}$ with $p + 1 < q$ and either $p \ne 1$ or $q \ne k$, then the corresponding
cycle is chord-free. If such values of $p$ and $q$ exist, then consider the two cases illustrated in Figure
1, where 0-edges are shown as dashed lines, 1-edges are shown as solid lines, and the wavy lines
represent sequences of 1-edges. Case 1: Edge $(i_p, i_q)$ is a 0-edge (shown on the left). Then the
transitivity constraint:

$$e_{[i_p,i_{p+1}]} \land \cdots \land e_{[i_{q-1},i_q]} \Rightarrow e_{[i_p,i_q]}$$

is violated and has fewer than $k$ variables in its antecedent. Case 2: Edge $(i_p, i_q)$ is a 1-edge (shown
on the right). Then the transitivity constraint:

$$e_{[i_1,i_2]} \land \cdots \land e_{[i_{p-1},i_p]} \land e_{[i_p,i_q]} \land e_{[i_q,i_{q+1}]} \land \cdots \land e_{[i_{k-1},i_k]} \Rightarrow e_{[i_1,i_k]}$$

is violated and has fewer than $k$ variables. Both cases contradict our assumption that there is no
violated transitivity constraint with fewer than $k$ variables in the antecedent. □

Figure 1: Case Analysis for Proposition 3. 0-Edges are shown as dashed lines. When a cycle
representing a transitivity violation contains a chord, we can find a smaller cycle that also represents
a transitivity violation.

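Each length $k$ cycle $[i_1, i_2, \ldots, i_k, i_1]$ yields $k$ constraints, given by the following clauses. Each
clause is derived by expressing Equation 1 as a disjunction.

$$\begin{array}{l}
\lnot e_{[i_1,i_2]} \lor \lnot e_{[i_2,i_3]} \lor \cdots \lor \lnot e_{[i_{k-1},i_k]} \lor e_{[i_k,i_1]} \\
\lnot e_{[i_2,i_3]} \lor \cdots \lor \lnot e_{[i_{k-1},i_k]} \lor \lnot e_{[i_k,i_1]} \lor e_{[i_1,i_2]} \\
\qquad \vdots \\
\lnot e_{[i_k,i_1]} \lor \lnot e_{[i_1,i_2]} \lor \cdots \lor \lnot e_{[i_{k-2},i_{k-1}]} \lor e_{[i_{k-1},i_k]}
\end{array} \qquad (2)$$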

For a set of relational variables $\mathcal{E}$, we define $F_{trans}(\mathcal{E})$ to be the conjunction of all transitivity
constraints for all chord-free cycles in the graph $G(\mathcal{E})$.

Theorem 1 An assignment to the relational variables $\mathcal{E}$ will satisfy all of the transitivity
constraints given by Equation 1 if and only if it satisfies $F_{trans}(\mathcal{E})$.

This  theorem  follows  directly  from  Proposition  3  and  the  encoding  given  by  Equation  2. 
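
The clause encoding of Equation 2 can be sketched as follows: each of the $k$ edges of a cycle takes a turn as the positive literal, with the remaining $k - 1$ edges negated. The representation of a literal as a (variable, polarity) pair and the function names are assumptions made for this illustration.

```python
def edge_var(i, j):
    """Canonical key for e_[i,j]: the smaller index is listed first."""
    return (min(i, j), max(i, j))

def cycle_clauses(cycle):
    """Clauses for one cycle, following the pattern of Equation 2.

    cycle -- list of distinct vertices [i_1, ..., i_k]; the closing
             edge (i_k, i_1) is implied.
    Returns k clauses, each a list of (variable, polarity) literals,
    where polarity False marks a negated variable.
    """
    k = len(cycle)
    edges = [edge_var(cycle[p], cycle[(p + 1) % k]) for p in range(k)]
    clauses = []
    for head in range(k):
        # All edges except `head` appear negated; `head` appears positively.
        clause = [(e, False) for idx, e in enumerate(edges) if idx != head]
        clause.append((edges[head], True))
        clauses.append(clause)
    return clauses
```

For the triangle `[1, 2, 3]` this produces three clauses of three literals each, so a graph whose chord-free cycles all have length $k$ contributes $k$ clauses per cycle.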


3.1  Enumerating  Chord-Free  Cycles 

To enumerate the chord-free cycles of a graph, we exploit the following properties. An acyclic path
$[i_1, i_2, \ldots, i_k]$ is said to have a chord when there is an edge $(i_p, i_q)$ in $G(\mathcal{E})$ such that $1 \le p < q \le k$
with $p + 1 < q$, and either $p \ne 1$ or $q \ne k$. We classify a chord-free path as terminal when $(i_k, i_1)$
is in $G(\mathcal{E})$, and as extensible otherwise.

Proposition 4 A path $[i_1, i_2, \ldots, i_k]$ is chord-free and terminal if and only if the cycle $[i_1, i_2, \ldots, i_k, i_1]$
is chord-free.

This follows by noting that the conditions imposed on a chord-free path are identical to those for a
chord-free cycle, except that the latter includes a closing edge $(i_k, i_1)$.

A proper prefix of path $[i_1, i_2, \ldots, i_k]$ is a path $[i_1, i_2, \ldots, i_j]$ such that $1 \le j < k$.

Proposition 5 Every proper prefix of a chord-free path is chord-free and extensible.

Clearly, any prefix of a chord-free path is also chord-free. If some prefix $[i_1, i_2, \ldots, i_j]$ with
$j < k$ were terminal, then any attempt to add the edge $(i_j, i_{j+1})$ would yield either a simple cycle
(when $i_{j+1} = i_1$), some other cycle (when $i_{j+1} = i_p$ for some $1 < p < j$), or a path having $(i_j, i_1)$
as a chord.

Given these properties, we can enumerate the set of all chord-free paths by breadth-first expansion.
As we enumerate these paths, we also generate $C$, the set of all chord-free cycles. Define $P_k$
to be the set of all extensible, chord-free paths having $k$ vertices, for $1 \le k \le N$.

Initially we have $P_1 = \{[i] \mid 1 \le i \le N\}$, and $C = \emptyset$. Given set $P_k$, we generate set $P_{k+1}$ and
add some cycles of length $k + 1$ to $C$. For each path $[i_1, i_2, \ldots, i_k] \in P_k$, we consider the path
$[i_1, i_2, \ldots, i_k, i_{k+1}]$ for each edge $(i_k, i_{k+1})$ in $G(\mathcal{E})$. When $i_{k+1} = i_p$ for some $1 \le p < k$, we
classify the path as cyclic. When there is an edge $(i_{k+1}, i_p)$ in $G(\mathcal{E})$ for some $1 < p < k$, we
classify the path as having a chord. When there is an edge $(i_{k+1}, i_1)$ in $G(\mathcal{E})$, we add the cycle
$[i_1, i_2, \ldots, i_k, i_{k+1}, i_1]$ to $C$. Otherwise, we add the path to $P_{k+1}$.

After  generating  all  of  these  paths,  we  can  use  the  set  C  to  generate  the  set  of  all  chord-free 
cycles.  For  each  terminal,  chord-free  cycle  having  k  vertices,  there  will  be  2k  members  of  C — 
each  of  the  k  edges  of  the  cycle  can  serve  as  the  closing  edge,  and  a  cycle  can  traverse  the  closing 
edge  in  either  direction.  To  generate  the  set  of  clauses  given  by  Equation  2,  we  simply  need  to 
choose one element of $C$ for each closing edge, e.g., by considering only cycles $[i_1, \ldots, i_k, i_1]$ for
which $i_1 < i_k$.
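
The breadth-first expansion described above can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the implementation used for the experiments: the names `chord_free_cycles` and `mesh_edges` are hypothetical, and duplicates are removed by keying each cycle on its vertex set (a chord-free cycle is determined by its vertices) instead of the $i_1 < i_k$ convention. The closing-edge test is skipped for single-vertex paths, since the edge just added would otherwise be mistaken for a closing edge.

```python
def chord_free_cycles(n, edges):
    """Enumerate the chord-free cycles of an undirected graph on 1..n.

    Returns a dict mapping each cycle's vertex set to one vertex ordering.
    """
    adj = {v: set() for v in range(1, n + 1)}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)

    cycles = {}
    frontier = [(v,) for v in adj]              # P_1: single-vertex paths
    while frontier:
        next_frontier = []                      # becomes P_{k+1}
        for path in frontier:
            first, last, interior = path[0], path[-1], path[1:-1]
            for nxt in adj[last]:
                if nxt in path:
                    continue                    # cyclic: revisits a vertex
                if any(nxt in adj[p] for p in interior):
                    continue                    # extension would have a chord
                extended = path + (nxt,)
                if len(path) >= 2 and first in adj[nxt]:
                    # terminal: closing edge exists, record the cycle once
                    cycles.setdefault(frozenset(extended), extended)
                else:
                    next_frontier.append(extended)   # extensible
        frontier = next_frontier
    return cycles

def mesh_edges(n):
    """Edges of an n x n planar mesh, vertices numbered 1..n*n row by row."""
    edges = []
    for r in range(n):
        for c in range(n):
            v = r * n + c + 1
            if c + 1 < n:
                edges.append((v, v + 1))
            if r + 1 < n:
                edges.append((v, v + n))
    return edges
```

On the 4 x 4 mesh, `chord_free_cycles(16, mesh_edges(4))` finds 24 cycles whose lengths sum to 192, matching the cycle and clause counts reported for that mesh in the "Direct" columns of Table 2.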

As Figure 2 indicates, there can be an exponential number of chord-free cycles in a graph.
In particular, this figure illustrates a family of graphs with $3n + 1$ vertices. Consider the cycles
passing through the $n$ diamond-shaped faces as well as the edge along the bottom. For each
diamond-shaped face $F_i$, a cycle can pass through either the upper vertex or the lower vertex. Thus
there are $2^n$ such cycles. In addition, the edges forming the perimeter of each face $F_i$ create a
chord-free cycle, giving a total of $2^n + n$ chord-free cycles.
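
The $2^n + n$ count can be confirmed by brute force on small instances. The sketch below assumes a particular layout for the graph family, since the figure itself is not reproduced in the text: corner vertices $c_0, \ldots, c_n$, with diamond $t$ joining $c_{t-1}$ and $c_t$ through an upper and a lower vertex, plus one edge joining $c_0$ to $c_n$. The helper names are hypothetical. A vertex subset forms a chord-free cycle exactly when its induced subgraph is connected and every vertex has degree 2.

```python
from itertools import combinations

def diamond_chain(n):
    """Assumed layout of the graph family: 3n + 1 vertices, n diamonds."""
    edges = set()
    for t in range(1, n + 1):
        for side in ("lower", "upper"):
            mid = (side, t)
            edges.add(frozenset((("c", t - 1), mid)))
            edges.add(frozenset((mid, ("c", t))))
    edges.add(frozenset((("c", 0), ("c", n))))   # the edge along the bottom
    return edges

def count_chord_free_cycles(edges):
    """Count vertex subsets whose induced subgraph is a single cycle."""
    vertices = sorted(set().union(*edges))
    count = 0
    for size in range(3, len(vertices) + 1):
        for subset in combinations(vertices, size):
            chosen = set(subset)
            induced = [e for e in edges if e <= chosen]
            degree = {v: 0 for v in chosen}
            for e in induced:
                for v in e:
                    degree[v] += 1
            if any(d != 2 for d in degree.values()):
                continue
            # Degree 2 everywhere: a single cycle iff the subgraph is connected.
            seen, stack = {subset[0]}, [subset[0]]
            while stack:
                v = stack.pop()
                for e in induced:
                    if v in e:
                        (w,) = e - {v}
                        if w not in seen:
                            seen.add(w)
                            stack.append(w)
            if seen == chosen:
                count += 1
    return count
```

Under this assumed layout the formula holds for $n \ge 2$; with a single diamond the bottom edge would itself be a chord of the face. The subset enumeration is exponential in the number of vertices and is meant only as a cross-check on small $n$.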

Figure 2: Class of Graphs with Many Chord-Free Cycles. For a graph with $n$ diamond-shaped
faces, there are $2^n + n$ chord-free cycles.

The columns labeled "Direct" in Table 2 show results for enumerating the chord-free cycles for
our benchmarks. For each correct microprocessor, we have two graphs: one for which transitivity
constraints played no role in the verification, and one (indicated with a "t" at the end of the name)
modified to require enforcing transitivity constraints. We summarize the results for the transitivity
constraints in our 100 buggy variants of 2xDLX-CC in terms of the minimum, the average, and
the maximum of each measurement. We also show results for five synthetic benchmarks consisting
of $n \times n$ planar meshes $M_n$, with $n$ ranging from 4 to 8, where the mesh for $n = 6$ is illustrated
in Figure 3. For all of the circuit benchmarks, the number of cycles, although large, appears to be
manageable. Moreover, the cycles have at most 4 edges. The synthetic benchmarks, on the other
hand, demonstrate the exponential growth predicted as worst case behavior. The number of cycles
grows quickly as the meshes grow larger. Furthermore, the cycles can be much longer, causing the
number of clauses to grow even more rapidly.

                            Direct                        Dense                   Sparse
Circuit                     Edges     Cycles     Clauses  Edges  Cycles  Clauses  Edges  Cycles  Clauses
1xDLX-C                        27         90         360     78     286      858     33      40      120
1xDLX-C-t                      37         95         348     78     286      858     42      68      204
2xDLX-CA                      118      2,393       9,572    300   2,300    6,900    172     697    2,091
2xDLX-CA-t                    137      1,974       7,944    300   2,300    6,900    178     695    2,085
2xDLX-CC                      124      2,567      10,268    300   2,300    6,900    182     746    2,238
2xDLX-CC-t                    143      2,136       8,364    300   2,300    6,900    193     858    2,574
Full Buggy 2xDLX-CC min.       89      1,446       6,360    231   1,540    4,620    132     430    1,290
Full Buggy 2xDLX-CC avg.      124      2,562      10,270    300   2,300    6,900    182     750    2,244
Full Buggy 2xDLX-CC max.      132      3,216      12,864    299   2,292    6,877    196     885    2,655
M4                             24         24         192    120     560    1,680     42      44      132
M5                             40        229       3,056    300   2,300    6,900     77      98      294
M6                             60      3,436      61,528    630   7,140   21,420    131     208      624
M7                             84     65,772   1,472,184  1,176  18,424   55,272    206     408    1,224
M8                            112  1,743,247  48,559,844  2,016  41,664  124,992    294     662    1,986

Table 2: Cycles in Original and Augmented Benchmark Graphs. Results are given for the three
different methods of encoding transitivity constraints.


3.2  Adding  More  Relational  Variables


Enumerating the transitivity constraints based on the variables in $\mathcal{E}$ runs the risk of generating a
Boolean formula of exponential size. We can guarantee polynomial growth by considering a larger
set of relational variables. In general, let $\mathcal{E}'$ be some set of relational variables such that $\mathcal{E} \subseteq \mathcal{E}'$,
and let $F_{trans}(\mathcal{E}')$ be the transitivity constraint formula generated by enumerating the chord-free
cycles in the graph $G(\mathcal{E}')$.

Theorem 2 If $\mathcal{E}$ is the set of relational variables in $F_{sat}$ and $\mathcal{E} \subseteq \mathcal{E}'$, then the formula
$F_{sat} \land F_{trans}(\mathcal{E})$ is satisfiable if and only if $F_{sat} \land F_{trans}(\mathcal{E}')$ is satisfiable.

We introduce a series of lemmas to prove this theorem. For a propositional formula $F$ over a
set of variables $A$ and an assignment $\chi\colon A \rightarrow \{0,1\}$, define the valuation of $F$ under $\chi$, denoted
$[F]_{\chi}$, to be the result of evaluating formula $F$ according to assignment $\chi$. We first prove that we
can extend any assignment over a set of relational variables to one over a superset of these variables
yielding identical valuations for both transitivity constraint formulas.


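Lemma 1 For any sets of relational variables $\mathcal{E}_1$ and $\mathcal{E}_2$ such that $\mathcal{E}_1 \subseteq \mathcal{E}_2$, and for any assignment
$\chi_1\colon \mathcal{E}_1 \rightarrow \{0,1\}$, such that $[F_{trans}(\mathcal{E}_1)]_{\chi_1} = 1$, there is an assignment $\chi_2\colon \mathcal{E}_2 \rightarrow \{0,1\}$ such that
$[F_{trans}(\mathcal{E}_2)]_{\chi_2} = 1$.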


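Proof: We consider the case where $\mathcal{E}_2 = \mathcal{E}_1 \cup \{e_{i,j}\}$. The general statement of the proposition
then holds by induction on $|\mathcal{E}_2| - |\mathcal{E}_1|$.

Define assignment $\chi_2$ to be:

$$\chi_2(e) = \begin{cases}
\chi_1(e), & e \ne e_{i,j} \\
1, & \text{graph } G(\mathcal{E}_1, \chi_1) \text{ has a path of 1-edges from node } i \text{ to node } j \\
0, & \text{otherwise}
\end{cases}$$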


We  consider  two  cases: 


1. If $\chi_2(e_{i,j}) = 0$, then any cycle in $G(\mathcal{E}_2, \chi_2)$ through $e_{i,j}$ must contain a 0-edge other than
$e_{i,j}$. Hence adding this edge does not introduce any transitivity violations.

2. If $\chi_2(e_{i,j}) = 1$, then there must be some path $P_1$ of 1-edges between nodes $i$ and $j$ in
$G(\mathcal{E}_1, \chi_1)$. In order for the introduction of 1-edge $e_{i,j}$ to create a transitivity violation, there
must also be some path $P_2$ between nodes $i$ and $j$ in $G(\mathcal{E}_1, \chi_1)$ containing exactly one 0-edge.
But then we could concatenate paths $P_1$ and $P_2$ to form a cycle in $G(\mathcal{E}_1, \chi_1)$ containing
exactly one 0-edge, implying that $[F_{trans}(\mathcal{E}_1)]_{\chi_1} = 0$. We conclude therefore that adding
1-edge $e_{i,j}$ does not introduce any transitivity violations.
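
The single-variable extension step used in this proof can be sketched as follows. The function name and the dictionary representation of an assignment are assumptions for illustration; reachability over 1-edges is computed with a union-find structure, which is one of several ways to test for a path of 1-edges.

```python
def extend_assignment(n, chi1, new_edge):
    """Extend an assignment by one relational variable, as in Lemma 1.

    n        -- number of vertices, numbered 1..n
    chi1     -- dict mapping edges (i, j) to 0 or 1
    new_edge -- the edge (i, j) being added; assumed not to be in chi1
    Returns a new dict that agrees with chi1 and sets new_edge to 1
    exactly when chi1 already has a path of 1-edges from i to j.
    """
    parent = list(range(n + 1))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for (a, b), label in chi1.items():
        if label == 1:
            parent[find(a)] = find(b)

    i, j = new_edge
    chi2 = dict(chi1)                       # values on the old variables are kept
    chi2[new_edge] = 1 if find(i) == find(j) else 0
    return chi2
```

Repeating this step once per added variable gives the inductive extension from $\mathcal{E}_1$ to a larger set; because existing values are never changed, the valuation of any formula over the original variables is preserved.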


Lemma 2 For $\mathcal{E}_1 \subseteq \mathcal{E}_2$ and for any assignment $\chi_2\colon \mathcal{E}_2 \rightarrow \{0,1\}$, such that $[F_{trans}(\mathcal{E}_2)]_{\chi_2} = 1$, we
also have $[F_{trans}(\mathcal{E}_1)]_{\chi_2} = 1$.

Proof: We note that any cycle in $G(\mathcal{E}_1, \chi_2)$ must be present in $G(\mathcal{E}_2, \chi_2)$ and have the same
edge labeling. Thus, if $G(\mathcal{E}_2, \chi_2)$ has no cycle with a single 0-edge, then neither does $G(\mathcal{E}_1, \chi_2)$.
□

We  now  return  to  the  proof  of  Theorem  2. 

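Proof: Suppose that $F_{\text{sat}} \wedge F_{\text{trans}}(\mathcal{E})$ is satisfiable, i.e.,
there is some assignment $\chi$ such that $[F_{\text{sat}}]_\chi = [F_{\text{trans}}(\mathcal{E})]_\chi = 1$.
Then by Lemma 1 we can find an assignment $\chi'$ such that
$[F_{\text{trans}}(\mathcal{E}')]_{\chi'} = 1$. Furthermore, since the construction of $\chi'$ by
Lemma 1 preserves the values assigned to all variables in $\mathcal{E}$, and these are the only
relational variables occurring in $F_{\text{sat}}$, we can conclude that
$[F_{\text{sat}}]_{\chi'} = 1$. Therefore $F_{\text{sat}} \wedge F_{\text{trans}}(\mathcal{E}')$ is
satisfiable.

Suppose on the other hand that $F_{\text{sat}} \wedge F_{\text{trans}}(\mathcal{E}')$ is
satisfiable, i.e., there is some assignment $\chi'$ such that
$[F_{\text{sat}}]_{\chi'} = [F_{\text{trans}}(\mathcal{E}')]_{\chi'} = 1$. Then by Lemma 2 we also
have $[F_{\text{trans}}(\mathcal{E})]_{\chi'} = 1$, and hence
$F_{\text{sat}} \wedge F_{\text{trans}}(\mathcal{E})$ is satisfiable. □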

Our  goal  then  is  to  add  as  few  relational  variables  as  possible  in  order  to  reduce  the  size  of 
the  transitivity  formula.  We  will  continue  to  use  our  path  enumeration  algorithm  to  generate  the 
transitivity  formula. 

3.3  Dense  Enumeration 

For the dense enumeration method, let $\mathcal{E}_N$ denote the set of variables $e_{i,j}$ for all
values of $i$ and $j$ such that $1 \le i < j \le N$. Graph $G(\mathcal{E}_N)$ is a complete,
undirected graph. In this graph, any cycle of length greater than three must have a chord. Hence our
algorithm will enumerate transitivity constraints of the form
$e_{[i,j]} \wedge e_{[j,k]} \Rightarrow e_{[i,k]}$, for all distinct values of $i$, $j$, and $k$.

The graph has $N(N-1)/2$ edges and $N(N-1)(N-2)/6$ chord-free cycles, yielding a total of
$N(N-1)(N-2)/2 = O(N^3)$ transitivity constraints.
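As a quick sanity check of these counts, the following sketch enumerates the edges, triangles, and implications of the complete graph. The function name `dense_constraints` and the tuple representation of a constraint are illustrative assumptions, not part of the original implementation.

```python
from itertools import combinations


def dense_constraints(n):
    """Edges, triangles and transitivity implications for the complete
    graph on nodes 1..n. Each implication is ((edge, edge), implied_edge)."""
    nodes = range(1, n + 1)
    edges = list(combinations(nodes, 2))
    triangles = list(combinations(nodes, 3))
    constraints = []
    for i, j, k in triangles:
        a, b, c = (i, j), (j, k), (i, k)
        # Every triangle contributes three implications:
        # any two of its sides imply the third.
        constraints += [((a, b), c), ((a, c), b), ((b, c), a)]
    return edges, triangles, constraints
```

For N = 9 this gives 36 edges, 84 triangles, and 252 constraints, matching the closed forms $N(N-1)/2$, $N(N-1)(N-2)/6$, and $N(N-1)(N-2)/2$.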

The columns labeled "Dense" in Table 2 show the complexity of this method for the benchmark
circuits. For the smaller graphs 1xDLX-C, 1xDLX-C-t, $M_4$, and $M_5$, this method yields more
clauses than direct enumeration of the cycles in the original graph. For the larger graphs, however,
it yields fewer clauses. The advantage of the dense method is most evident for the mesh graphs,
where the cubic complexity is far superior to exponential.


3.4  Sparse  Enumeration 

We can improve on both of these methods by exploiting the sparse structure of $G(\mathcal{E})$. Like
the dense method, we want to introduce additional relational variables to give a set of variables
$\mathcal{E}^+$ such that the resulting graph $G(\mathcal{E}^+)$ becomes chordal [Rose70]. That is,
the graph has the property that every cycle of length greater than three has a chord.

Chordal  graphs  have  been  studied  extensively  in  the  context  of  sparse  Gaussian  elimination.  In 
fact,  the  problem  of  finding  a  minimum  set  of  additional  variables  to  add  to  our  set  is  identical  to 
the  problem  of  finding  an  elimination  ordering  for  Gaussian  elimination  that  minimizes  the  amount 
of fill-in. Although this problem is NP-complete [Yan81], there are good heuristic solutions. In
particular, our implementation proceeds as a series of elimination steps. On each step, we remove
some vertex $i$ from the graph. For every pair of distinct, uneliminated vertices $j$ and $k$ such
that the graph contains edges $(i,j)$ and $(i,k)$, we add an edge $(j,k)$ if it does not already
exist. The original graph plus all of the added edges then forms a chordal graph. To choose which
vertex to eliminate on a given step, our implementation uses the simple heuristic of choosing the
vertex with minimum degree. If more than one vertex has minimum degree, we choose one that minimizes
the number of new edges added.

                               $C_{\text{sat}}$           $C_{\text{trans}} \cup C_{\text{sat}}$
Circuit                        Satisfiable?    Secs.      Satisfiable?    Secs.      Ratio
1xDLX-C                        N                   3      N                   4        1.4
1xDLX-C-t                      Y                   1      N                   9       N.A.
2xDLX-CA                       N                 176      N               1,275        7.2
2xDLX-CA-t                     Y                   3      N                 896       N.A.
2xDLX-CC                       N               5,035      N               9,932        2.6
2xDLX-CC-t                     Y                   4      N              15,003       N.A.
Full Buggy 2xDLX-CC, min.      Y                   1      Y                   1        0.2
Full Buggy 2xDLX-CC, avg.      Y                 125      Y               1,517        2.3
Full Buggy 2xDLX-CC, max.      Y               2,186      Y              43,817       69.4

Table 3: Performance of FGRASP on Benchmark Circuits. Results are given both without and
with transitivity constraints.

The  columns  in  Table  2  labeled  “Sparse”  show  the  effect  of  making  the  benchmark  graphs 
chordal  by  this  method.  Observe  that  this  method  gives  superior  results  to  either  of  the  other  two 
methods.  In  our  implementation  we  have  therefore  used  the  sparse  method  to  generate  all  of  the 
transitivity  constraint  formulas. 
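The elimination procedure used by the sparse method can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `make_chordal`, the adjacency-set representation, and the final tie-break on vertex label are assumptions. It also collects the triangles formed at each elimination step, which are the three-edge cycles from which the transitivity constraints of a chordal graph are generated.

```python
def make_chordal(vertices, edges):
    """Min-degree elimination. Returns (edges including fill-in, triangles)."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    all_edges = {frozenset(e) for e in edges}
    triangles = set()

    def fill_in(v):
        # Pairs of uneliminated neighbours of v that are not yet adjacent.
        nbrs = sorted(adj[v])
        return [(a, b) for k, a in enumerate(nbrs) for b in nbrs[k + 1:]
                if b not in adj[a]]

    while adj:
        # Minimum degree first; ties broken by fewest new edges, then label.
        v = min(adj, key=lambda x: (len(adj[x]), len(fill_in(x)), x))
        for a, b in fill_in(v):
            adj[a].add(b)
            adj[b].add(a)
            all_edges.add(frozenset((a, b)))
        nbrs = sorted(adj.pop(v))  # v is now eliminated
        for k, a in enumerate(nbrs):
            adj[a].discard(v)
            for b in nbrs[k + 1:]:
                triangles.add(frozenset((v, a, b)))
    return all_edges, triangles
```

Applied to a four-node cycle, the sketch adds a single chord and reports two triangles, so three clauses per triangle would give six transitivity clauses.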


4  SAT-Based  Decision  Procedures 

Most Boolean satisfiability (SAT) checkers take as input a formula expressed in clausal form.
Each clause is a set of literals, where a literal is either a variable or its complement. A clause
denotes the disjunction of its literals. The task of the checker is to either find an assignment to
the variables that satisfies all of the clauses or to determine that no such assignment exists. We
can solve the constrained satisfiability problem using a conventional SAT checker by generating a
set of clauses $C_{\text{trans}}$ representing $F_{\text{trans}}(\mathcal{E}^+)$ and a set of
clauses $C_{\text{sat}}$ representing the formula $F_{\text{sat}}$. We then run the checker on the
combined clause set $C_{\text{sat}} \cup C_{\text{trans}}$ to find satisfying solutions to
$F_{\text{sat}} \wedge F_{\text{trans}}(\mathcal{E}^+)$.
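The clause generation step can be illustrated with a small sketch. The integer literal encoding (negative numbers for complemented variables), the function names, and the exhaustive search that stands in for a real SAT checker are all assumptions made for this example.

```python
from itertools import product


def transitivity_clauses(triangles, var):
    """Three clauses per triangle (a, b, c). `var` maps an edge, given as a
    frozenset of its two endpoints, to a positive integer variable."""
    clauses = []
    for a, b, c in triangles:
        x = var[frozenset((a, b))]
        y = var[frozenset((b, c))]
        z = var[frozenset((a, c))]
        # x & y => z, x & z => y, y & z => x, each written as a disjunction.
        clauses += [[-x, -y, z], [-x, -z, y], [-y, -z, x]]
    return clauses


def brute_force_sat(clauses, num_vars):
    """Stand-in for a SAT checker: try every assignment (tiny inputs only)."""
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return bits
    return None


# Hypothetical instance: F_sat asserts e12, e23 and the complement of e13.
var = {frozenset((1, 2)): 1, frozenset((2, 3)): 2, frozenset((1, 3)): 3}
c_sat = [[1], [2], [-3]]
c_trans = transitivity_clauses([(1, 2, 3)], var)
```

Here `c_sat` alone is satisfiable, while `c_sat + c_trans` is not, which mirrors the behavior described for the benchmarks that depend on transitivity.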

In experimenting with a number of Boolean satisfiability checkers, we have found that FGRASP
[MS99] has the best overall performance. The most recent version can be directed to periodically
restart the search using a randomly-generated variable assignment [M99]. This is the first SAT
checker we have tested that can complete all of our benchmarks. All of our experiments were
conducted on a 336 MHz Sun UltraSPARC II with 1.2 GB of primary memory.

As indicated by Table 3, we ran FGRASP on clause sets $C_{\text{sat}}$ and
$C_{\text{trans}} \cup C_{\text{sat}}$, i.e., both without and with transitivity constraints. For
benchmarks 1xDLX-C, 2xDLX-CA, and 2xDLX-CC, the formula $F_{\text{sat}}$ is unsatisfiable. As can
be seen, including transitivity constraints increases the run time significantly. For benchmarks
1xDLX-C-t, 2xDLX-CA-t, and 2xDLX-CC-t, the formula $F_{\text{sat}}$ is satisfiable, but only
because transitivity is not enforced. When we add the clauses for $F_{\text{trans}}$, the formula
becomes unsatisfiable. For the buggy circuits, the run times for $C_{\text{sat}}$ range from under
1 second to over 36 minutes. The run times for $C_{\text{trans}} \cup C_{\text{sat}}$ range from
less than one second to over 12 hours. In some cases, adding transitivity constraints actually
decreased the CPU time (by as much as a factor of 5), but in most cases the CPU time increased (by
as much as a factor of 69). On average (using the geometric mean) adding transitivity constraints
increased the CPU time by a factor of 2.3. We therefore conclude that satisfiability checking with
transitivity constraints is more difficult than conventional satisfiability checking, but the added
complexity is not overwhelming.


5  OBDD-Based  Decision  Procedures 

A simple-minded approach to solving satisfiability with transitivity constraints using OBDDs
would be to generate separate OBDD representations of $F_{\text{trans}}$ and $F_{\text{sat}}$. We
could then use the Apply operation to generate an OBDD for
$F_{\text{trans}} \wedge F_{\text{sat}}$, and then either find a satisfying assignment or determine
that the function is unsatisfiable. We show that for some sets of relational variables
$\mathcal{E}$, the OBDD representation of $F_{\text{trans}}(\mathcal{E})$ can be too large to
represent and manipulate. In our experiments, we use the CUDD OBDD package with dynamic variable
reordering by sifting.


5.1  Lower Bound on the OBDD Representation of $F_{\text{trans}}(\mathcal{E})$

We prove that for some sets $\mathcal{E}$, the OBDD representation of
$F_{\text{trans}}(\mathcal{E})$ may be of exponential size for all possible variable orderings. As
mentioned earlier, the NP-completeness result proved by Goel, et al. [GSZAS98] has implications for
the complexity of representing $F_{\text{trans}}(\mathcal{E})$ as an OBDD. They showed that given
an OBDD $G_{\text{sat}}$ representing formula $F_{\text{sat}}$, the task of finding a satisfying
assignment of $F_{\text{sat}}$ that also satisfies the transitivity constraints in
$\mathit{Trans}(\mathcal{E})$ is NP-complete in the size of $G_{\text{sat}}$. By this, assuming
$P \neq NP$, we can infer that the OBDD representation of $F_{\text{trans}}(\mathcal{E})$ may be of
exponential size when using the same variable ordering as is used in $G_{\text{sat}}$. Our result
extends this lower bound to arbitrary variable orderings and is independent of the P vs. NP problem.

Let $M_n$ denote a planar mesh consisting of a square array of $n \times n$ vertices. For example,
Figure 3 shows the graph for $n = 6$. Being a planar graph, the edges partition the plane into
faces. As shown in Figure 3, we label these $F_{i,j}$ for $1 \le i, j \le n-1$. There are a total of
$(n-1)^2$ such faces. One can see that the set of edges forming the border of each face forms a
chord-free cycle of $M_n$. As shown in Table 2, many other cycles are also chord-free, e.g., the
perimeter of any rectangular region having height and width greater than 1, but we will consider
only the cycles corresponding to single faces.

Figure 3: Mesh Graph $M_6$.

Define $\mathcal{E}_{n \times n}$ to be a set of relational variables corresponding to the edges in
$M_n$. $F_{\text{trans}}(\mathcal{E}_{n \times n})$ is then an encoding of the transitivity
constraints for these variables.

Theorem 3 Any OBDD representation of $F_{\text{trans}}(\mathcal{E}_{n \times n})$ must have
$\Omega(2^{n/4})$ vertices.

To prove this theorem, consider any ordering of the variables representing the edges in $M_n$.
Let A denote those in the first half of the ordering, and B denote those in the second half. We can
then classify each face according to the four edges forming its border:

A:  All  are  in  A. 

B:  All  are  in  B. 

C:  Some  are  in  A,  while  others  are  in  B.  These  are  called  “split”  faces. 

Observe  that  we  cannot  have  a  type  A  face  adjacent  to  a  type  B  face,  since  their  shared  edge  cannot 
be  in  both  A  and  B.  Therefore  there  must  be  split  faces  separating  any  region  of  type  A  faces  from 
any  region  of  type  B  faces. 

For example, Figure 4 shows three possible partitionings of the edges of $M_6$ and the resulting
classification of the faces. If we let $a$, $b$, and $c$ denote the number of faces of each
respective type, we see that we always have $c \ge 5 = n - 1$. In particular, a minimum value for
$c$ is achieved when the partitioning of the edges corresponds to a partitioning of the graph into a
region of type A faces and a region of type B faces, each having nearly equal size, with the split
faces forming the boundary between the two regions.

Figure 4: Partitioning Edges into Sets A (solid) and B (dashed). Each face can then be classified
as type A (all solid), B (all dashed), or C (mixed).

Lemma 3 For any partitioning of the edges of mesh graph $M_n$ into equally-sized sets A and B,
there must be at least $(n-3)/2$ split faces.

Note that this lower bound is somewhat weak; it seems clear that we must have $c \ge n - 1$.
However, this weaker bound will suffice to prove an exponential lower bound on the OBDD size.

Proof: Our proof is an adaptation of a proof by Leighton [Lei92, Theorem 1.21] that $M_n$ has
a bisection bandwidth of at least $n$. That is, one would have to remove at least $n$ edges to split
the graph into two parts of equal size.

Observe that $M_n$ has $n^2$ vertices and $2n(n-1)$ edges. These edges are split so that $n(n-1)$
are in A and $n(n-1)$ are in B.

Let $M_n^*$ denote the planar dual of $M_n$. That is, it contains a vertex $u_{i,j}$ for each face
$F_{i,j}$ of $M_n$, and edges between pairs of vertices such that the corresponding faces in $M_n$
have a common edge. In fact, one can readily see that this graph is isomorphic to $M_{n-1}$.

Partition the vertices of $M_n^*$ into sets $U_a$, $U_b$, and $U_c$ according to the types of their
corresponding faces. Let $a$, $b$, and $c$ denote the number of elements in each of these sets. Each
face of $M_n$ has four bordering edges, and each edge is the border of at most two faces. Thus, as
an upper bound on $a$, we must have $4a \le 2n(n-1)$, giving $a \le n(n-1)/2$, and similarly for
$b$. In addition, since a face of type A cannot be adjacent in $M_n$ to one of type B, no vertex in
$U_a$ can be adjacent in $M_n^*$ to one in $U_b$.

Consider the complete, directed, bipartite graph having as edges the set
$(U_a \times U_b) \cup (U_b \times U_a)$, i.e., a total of $2ab$ edges. Given the bounds
$a + b = (n-1)^2 - c$, $a \le n(n-1)/2$, and $b \le n(n-1)/2$, the minimum value of $2ab$ is
achieved when either $a = n(n-1)/2$ and $b = (n-1)^2 - (n-1)n/2 - c = (n-1)(n-2)/2 - c$, or
vice versa, giving a lower bound:

$$
2ab \;\ge\; 2\,[n(n-1)/2] \cdot [(n-1)(n-2)/2 - c] \;=\; n(n-1)^2(n-2)/2 - cn(n-1)
$$


We can embed this bipartite graph in the dual graph $M_n^*$ by forming a path from vertex
$u_{i,j}$ to vertex $u_{i',j'}$, where either $u_{i,j} \in U_a$ and $u_{i',j'} \in U_b$, or
vice versa. By convention, we will use the path that first follows vertical edges to $u_{i',j}$ and
then follows horizontal edges to $u_{i',j'}$. We must have at least one vertex in $U_c$ along each
such path, and therefore removing the vertices in $U_c$ would cut all $2ab$ paths.

For each vertex $u_{i,j} \in U_c$, we can bound the total number of paths passing through it by
separately considering paths that enter from the bottom, the top, the left, and the right. For those
entering from the bottom, there are at most $n - i - 1$ source vertices and $i(n-1)$ destination
vertices, giving at most $i(n-i-1)(n-1)$ paths. This quantity is maximized for $i = (n-1)/2$,
giving an upper bound of $(n-1)^3/4$. A similar argument shows that there are at most $(n-1)^3/4$
paths entering from the top of any vertex. For the paths entering from the left, there are at most
$(j-1)(n-1)$ source vertices and $(n-j)$ destinations, giving at most $(j-1)(n-j)(n-1)$ paths. This
quantity is maximized when $j = (n+1)/2$, giving an upper bound of $(n-1)^3/4$. This bound also
holds for those paths entering from the right. Thus, removing a single vertex would cut at most
$(n-1)^3$ paths.

Combining the lower bound on the number of paths $2ab$, the upper bound on the number of
paths cut by removing a single vertex, and the fact that we are removing $c$ vertices, we have:

$$
\begin{aligned}
c(n-1)^3 &\ge n(n-1)^2(n-2)/2 - cn(n-1) \\
c(n-1)^2 + cn &\ge n(n-1)(n-2)/2 \\
c(n^2 - n + 1) &\ge n(n-1)(n-2)/2
\end{aligned}
$$

We can rewrite $n(n-1)(n-2)$ as $(n^2 - n + 1)(n-3) + n^2 - 2n + 3$. Observing that
$n^2 - 2n + 3 > 0$ for all values of $n$, we have:

$$
\begin{aligned}
c(n^2 - n + 1) &\ge (n^2 - n + 1)(n-3)/2 + (n^2 - 2n + 3)/2 \\
               &> (n^2 - n + 1)(n-3)/2 \\
c &> (n-3)/2
\end{aligned}
$$


□ 
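For very small meshes the statement of Lemma 3 can be checked exhaustively. The sketch below is an illustration only; the vertex and edge encoding and the function names are assumptions introduced here. It enumerates every balanced partition of the edges of $M_n$ and reports the minimum number of split faces, and it also checks the polynomial identity used in the last step of the proof.

```python
from itertools import combinations


def mesh_edges(n):
    """Edges of the n x n mesh; vertices are (row, col), 1-based."""
    edges = []
    for r in range(1, n + 1):
        for c in range(1, n + 1):
            if c < n:
                edges.append(((r, c), (r, c + 1)))  # horizontal edge
            if r < n:
                edges.append(((r, c), (r + 1, c)))  # vertical edge
    return edges


def face_edges(i, j):
    """The four border edges of face F_{i,j}: top, bottom, left, right."""
    return [((i, j), (i, j + 1)), ((i + 1, j), (i + 1, j + 1)),
            ((i, j), (i + 1, j)), ((i, j + 1), (i + 1, j + 1))]


def min_split_faces(n):
    """Minimum number of split faces over all equal-sized partitions (A, B)."""
    edges = mesh_edges(n)
    faces = [face_edges(i, j) for i in range(1, n) for j in range(1, n)]
    best = len(faces)
    for chosen in combinations(edges, len(edges) // 2):
        in_a = set(chosen)
        # A face is split when it has edges in both A and B.
        split = sum(1 for f in faces if 0 < sum(e in in_a for e in f) < 4)
        best = min(best, split)
    return best


def identity_holds(n):
    """n(n-1)(n-2) == (n^2 - n + 1)(n - 3) + n^2 - 2n + 3."""
    return n * (n - 1) * (n - 2) == (n * n - n + 1) * (n - 3) + n * n - 2 * n + 3
```

For $n = 3$ the search covers 924 partitions and finds a minimum of 2 split faces, which agrees with the conjectured bound $c \ge n - 1$ and is well above the proven bound $(n-3)/2 = 0$. The search is exponential in the number of edges, so it is practical only for tiny meshes.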


A  set  of  faces  is  said  to  be  edge  independent  when  no  two  members  of  the  set  share  an  edge. 

Lemma 4 For any partitioning of the edges of mesh graph $M_n$ into equally-sized sets A and B,
there must be an edge-independent set of split faces containing at least $(n-3)/4$ elements.

Proof: Classify the parity of face $F_{i,j}$ as "even" when $i + j$ is even, and as "odd" otherwise.
Observe that no two faces of the same parity can have a common edge. Divide the set of split
faces into two subsets: those with even parity and those with odd. Both of these subsets are edge
independent, and one of them must have at least 1/2 of the elements of the set of all split faces. □

We can now complete the proof of Theorem 3.

Proof: Suppose there is an edge-independent set of $k$ split faces. For each split face, choose one
edge in A and one edge in B bordering that face. For each value $y \in \{0,1\}^k$, define
assignment $\alpha_y$ (respectively, $\beta_y$) to the variables representing edges in A (resp., B)
as follows. For an edge $e$ that is not part of any of the $k$ split faces, define
$\alpha_y(e) = 0$ (resp., $\beta_y(e) = 0$). For an edge $e$ that is part of a split face, but is
not one of the ones chosen specially, let $\alpha_y(e) = 1$ (resp., $\beta_y(e) = 1$). For an edge
$e$ that is the chosen variable in face $i$, let $\alpha_y(e) = y_i$ (resp., $\beta_y(e) = y_i$).
This will give us an assignment $\alpha_y \cdot \beta_y$ to all of the variables that evaluates
to 1. That is, for each independent, split face $F_i$, we will have two 1-edges when $y_i = 0$ and
four 1-edges when $y_i = 1$. All other cycles in the graph will have at least two 0-edges.

On the other hand, for any $y, z \in \{0,1\}^k$ such that $y \neq z$, the assignment
$\alpha_y \cdot \beta_z$ will cause an evaluation to 0, because for any face $i$ where
$y_i \neq z_i$, all but one edge will be assigned value 1. Thus, the set of assignments
$\{\alpha_y \mid y \in \{0,1\}^k\}$ forms an OBDD fooling set, as defined in [Bry91], implying that
the OBDD must have at least $2^k \ge 2^{(n-3)/4} = \Omega(2^{n/4})$ vertices. □
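The fooling-set argument can be checked mechanically on a small mesh. The sketch below makes several assumptions of its own: the variable ordering places all horizontal edges (set A) before all vertical edges (set B), so that every face is split; the selected faces are those with both indices odd, which share no vertices (a slightly stronger condition than edge independence, chosen to keep the check simple); and the chosen edges are the top and left edges of each face. All function names are hypothetical.

```python
from itertools import product


def mesh_edges(n):
    """Edges of the n x n mesh; vertices are (row, col), 1-based."""
    return ([((r, c), (r, c + 1)) for r in range(1, n + 1) for c in range(1, n)] +
            [((r, c), (r + 1, c)) for r in range(1, n) for c in range(1, n + 1)])


def face_edges(i, j):
    """Border edges of face F_{i,j} in the order top, bottom, left, right."""
    return [((i, j), (i, j + 1)), ((i + 1, j), (i + 1, j + 1)),
            ((i, j), (i + 1, j)), ((i, j + 1), (i + 1, j + 1))]


def evaluates_to_one(n, value):
    """F_trans on the mesh: 1 iff no 0-edge joins nodes linked by 1-edges."""
    parent = {(r, c): (r, c) for r in range(1, n + 1) for c in range(1, n + 1)}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for (u, v), val in value.items():
        if val == 1:
            parent[find(u)] = find(v)
    return all(find(u) != find(v) for (u, v), val in value.items() if val == 0)


def fooling_check(n):
    """Returns (k, ok): ok is True iff alpha_y . beta_z evaluates to 1
    exactly when y == z, for all y, z in {0,1}^k."""
    edges = mesh_edges(n)
    faces = [(i, j) for i in range(1, n, 2) for j in range(1, n, 2)]
    k = len(faces)
    chosen_a = [face_edges(i, j)[0] for i, j in faces]  # top edge: horizontal, in A
    chosen_b = [face_edges(i, j)[2] for i, j in faces]  # left edge: vertical, in B
    in_face = {e for i, j in faces for e in face_edges(i, j)}

    def assignment(y, z):
        value = {e: (1 if e in in_face else 0) for e in edges}
        for idx in range(k):
            value[chosen_a[idx]] = y[idx]  # alpha_y on the A side
            value[chosen_b[idx]] = z[idx]  # beta_z on the B side
        return value

    ok = all(evaluates_to_one(n, assignment(y, z)) == (y == z)
             for y in product((0, 1), repeat=k)
             for z in product((0, 1), repeat=k))
    return k, ok
```

Under these assumptions, `fooling_check(4)` uses $k = 4$ faces and confirms that the 16 assignments $\alpha_y$ are pairwise distinguished, so any OBDD with this ordering needs at least 16 vertices at the boundary between A and B.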

We have seen that adding relational variables can reduce the number of cycles and therefore
simplify the transitivity constraint formula. This raises the question of how adding relational
variables affects the OBDD representation of the transitivity constraints. Unfortunately, the
exponential lower bound still holds.

Corollary 1 For any set of relational variables $\mathcal{E}$ such that
$\mathcal{E}_{n \times n} \subseteq \mathcal{E}$, any OBDD representation of
$F_{\text{trans}}(\mathcal{E})$ must contain $\Omega(2^{n/8})$ vertices.

The extra edges in $\mathcal{E}$ introduce complications, because they create cycles containing
edges from different faces. As a result, our lower bound is weaker.

Define  a  set  of  faces  as  vertex  independent  if  no  two  members  share  a  vertex. 

Lemma 5 For any partitioning of the edges of mesh graph $M_n$ into equal-sized sets A and B,
there must be a vertex-independent set of split faces containing at least $(n-3)/8$ elements.

Proof: Partition the set of split faces into four sets: EE, EO, OE, and OO, where face $F_{i,j}$ is
assigned to a set according to the values of $i$ and $j$:

EE:  Both  i  and  j  are  even. 

EO:  i  is  even  and  j  is  odd. 

OE:  i  is  odd  and  j  is  even. 

OO:  Both  i  and  j  are  odd. 

Each of these sets is vertex independent. At least one of the sets must contain at least 1/4 of
the elements. Since there are at least $(n-3)/2$ split faces, one of the sets must contain at least
$(n-3)/8$ vertex-independent split faces. □

We can now prove Corollary 1.

Proof: For any ordering of the variables in $\mathcal{E}$, partition them into two sets A and B
such that those in A come before those in B, and such that the variables in
$\mathcal{E}_{n \times n}$ are equally split between A and B. Suppose there is a vertex-independent
set of $k$ split faces. For each value $y \in \{0,1\}^k$, we define assignments $\alpha_y$ to the
variables in A and $\beta_y$ to the variables in B. These assignments are defined as they are in the
proof of Theorem 3, with the addition that each variable $e_{i,j}$ in
$\mathcal{E} - \mathcal{E}_{n \times n}$ is assigned value 0. Consider the set of assignments
$\alpha_y \cdot \beta_z$ for all values $y, z \in \{0,1\}^k$. The only cycles in
$G(\mathcal{E}, \alpha_y \cdot \beta_z)$ that can have fewer than two 0-edges will be those
corresponding to the perimeters of split faces. As in the proof of Theorem 3, the set
$\{\alpha_y \mid y \in \{0,1\}^k\}$ forms an OBDD fooling set, as defined in [Bry91], implying that
the OBDD must have at least $2^k \ge 2^{(n-3)/8} = \Omega(2^{n/8})$ vertices. □

Our lower bounds are fairly weak, but this is more a reflection of the difficulty of proving
lower bounds. We have found in practice that the OBDD representations of the transitivity
constraint functions arising from benchmarks tend to be large relative to those encountered during
the evaluation of $F_{\text{sat}}$. For example, although the OBDD representation of
$F_{\text{trans}}(\mathcal{E}^+)$ for benchmark 1xDLX-C-t is just 2,692 nodes (a function over 42
variables), we have been unable to construct the OBDD representations of this function for either
2xDLX-CA-t (178 variables) or 2xDLX-CC-t (193 variables) despite running for over 24 hours.

5.2  Enumerating  and  Eliminating  Violations 

Goel, et al. [GSZAS98] proposed a method that generates implicants (cubes) of the function
$F_{\text{sat}}$ from its OBDD representation. Each implicant is examined and discarded if it
violates a transitivity constraint. In our experiments, we have found this approach works well for
the normal, correctly-designed pipelines (i.e., circuits 1xDLX-C, 2xDLX-CA, and 2xDLX-CC) since the
formula $F_{\text{sat}}$ is unsatisfiable and hence has no implicants. For all 100 of our buggy
circuits, the first implicant generated contained no transitivity violation and hence was a valid
counterexample.

For circuits that do require enforcing transitivity constraints, we have found this approach
impractical. For example, in verifying 1xDLX-C-t by this means, we generated 253,216 implicants,
requiring a total of 35 seconds of CPU time (vs. 0.2 seconds for 1xDLX-C). For benchmarks
2xDLX-CA-t and 2xDLX-CC-t, our program ran for over 24 hours without having generated all
of the implicants. By contrast, circuits 2xDLX-CA and 2xDLX-CC can be verified in 11 and 29
seconds, respectively. Our implementation could be improved by making sure that we generate
only implicants that are irredundant and prime. In general, however, we believe that a verifier that
generates individual implicants will not be very robust. The complex control logic for a pipeline
can lead to formulas $F_{\text{sat}}$ containing very large numbers of implicants, even when
transitivity plays only a minor role in the correctness of the design.


5.3  Enforcing  a  Reduced  Set  of  Transitivity  Constraints 

One advantage of OBDDs over other representations of Boolean functions is that we can readily
determine the true support of the function, i.e., the set of variables on which the function
depends. This leads to a strategy of computing an OBDD representation of $F_{\text{sat}}$ and
intersecting its support with $\mathcal{E}$ to give a set $\hat{\mathcal{E}}$ of relational
variables that could potentially lead to transitivity violations. We then augment these variables to
make the graph chordal, yielding a set of variables $\hat{\mathcal{E}}^+$, and generate an OBDD
representation of $F_{\text{trans}}(\hat{\mathcal{E}}^+)$. We compute
$F_{\text{sat}} \wedge F_{\text{trans}}(\hat{\mathcal{E}}^+)$ and, if it is satisfiable, generate a
counterexample.

                                       Direct                       Dense                        Sparse
Circuit                       Verts.   Edges   Cycles   Clauses     Edges   Cycles   Clauses     Edges   Cycles   Clauses
1xDLX-C-t                          9      18       14        45        36       84       252        20       19        57
2xDLX-CA-t                        17      44      101       395       136      680     2,040        49       57       171
2xDLX-CC-t                        17      46      108       417       136      680     2,040        52       66       198
Reduced Buggy 2xDLX-CC, min.       3       2        0         0         3        1         3         2        0         0
Reduced Buggy 2xDLX-CC, avg.      12      17       19        75        73      303       910        21       14        42
Reduced Buggy 2xDLX-CC, max.      19      52      378     1,512       171      969     2,907        68      140       420

Table 4: Graphs for Reduced Transitivity Constraints. Results are given for the three different
methods of encoding transitivity constraints based on the variables in the true support of
$F_{\text{sat}}$.

                                OBDD Nodes                                                                                      CPU
Circuit                         $F_{\text{sat}}$   $F_{\text{trans}}(\hat{\mathcal{E}}^+)$   $F_{\text{sat}} \wedge F_{\text{trans}}(\hat{\mathcal{E}}^+)$   Secs.
1xDLX-C                                1                  1                  1                0.2
1xDLX-C-t                            530                344                  1                  2
2xDLX-CA                               1                  1                  1                 11
2xDLX-CA-t                        22,491             10,656                  1                109
2xDLX-CC                               1                  1                  1                 29
2xDLX-CC-t                        17,079              7,168                  1                441
Reduced Buggy 2xDLX-CC, min.          20                  1                 20                  7
Reduced Buggy 2xDLX-CC, avg.       3,173              1,483             25,057                107
Reduced Buggy 2xDLX-CC, max.      15,784             93,937            438,870              2,466

Table 5: OBDD-based Verification. Transitivity constraints were generated for a reduced set of
variables $\hat{\mathcal{E}}$.

Table 4 shows the complexity of the graphs generated by this method for our benchmark circuits.
Comparing these with the full graphs shown in Table 2, we see that we typically reduce the
number of relational variables (i.e., edges) by a factor of 3 for the benchmarks modified to require
transitivity and by an even greater factor for the buggy circuit benchmarks. The resulting graphs
are also very sparse. For example, we can see that both the direct and sparse methods of encoding
transitivity constraints greatly outperform the dense method.

Table 5 shows the complexity of applying the OBDD-based method to all of our benchmarks.
The original circuits 1xDLX-C, 2xDLX-CA, and 2xDLX-CC yielded formulas $F_{\text{sat}}$
that were unsatisfiable, and hence no transitivity constraints were required. The 3 modified
circuits 1xDLX-C-t, 2xDLX-CA-t, and 2xDLX-CC-t are more interesting. The reduction in the
number of relational variables makes it feasible to generate an OBDD representation of the
transitivity constraints. Compared to benchmarks 1xDLX-C, 2xDLX-CA, and 2xDLX-CC, we see
there is a significant, although tolerable, increase in the computational requirement to verify the
modified circuits. This can be attributed to both the more complex control logic and to the need to
apply the transitivity constraints.

For the 100 buggy variants of 2xDLX-CC, $F_{\text{sat}}$ depends on up to 52 relational variables,
with an average of 17. This yielded OBDDs for $F_{\text{trans}}(\hat{\mathcal{E}}^+)$ ranging up to
93,937 nodes, with an average of 1,483. The OBDDs for
$F_{\text{sat}} \wedge F_{\text{trans}}(\hat{\mathcal{E}}^+)$ ranged up to 438,870 nodes (average
25,057), showing that adding transitivity constraints does significantly increase the complexity of
the OBDD representation. However, this is just one OBDD at the end of a sequence of OBDD
operations. In the worst case, imposing transitivity constraints increased the total CPU time by a
factor of 2, but on average it only increased by 2%. The memory required to generate
$F_{\text{sat}}$ ranged from 9.8 to 50.9 MB (average 15.5), but even in the worst case the total
memory requirement increased by only 2%.
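The idea of restricting attention to the true support can be illustrated without an OBDD package. The sketch below is an assumption-laden stand-in: it represents a Boolean function as a Python callable over a dictionary of variable values and finds its support by exhaustive evaluation, whereas an OBDD package would read the support directly from the reduced diagram. The names `true_support` and `reduced_relational_variables` are hypothetical.

```python
from itertools import product


def true_support(f, variables):
    """Variables on which the Boolean function f actually depends.
    Brute force over all assignments; suitable for tiny examples only."""
    support = set()
    for bits in product((0, 1), repeat=len(variables)):
        env = dict(zip(variables, bits))
        for v in variables:
            flipped = dict(env)
            flipped[v] = 1 - env[v]
            # f depends on v if flipping v changes the result somewhere.
            if bool(f(env)) != bool(f(flipped)):
                support.add(v)
    return support


def reduced_relational_variables(f, variables, relational):
    """Intersect the true support of f with the relational variables."""
    return true_support(f, variables) & set(relational)
```

For example, the function `lambda e: (e["e12"] and e["p"]) or (e["e12"] and not e["p"])` mentions `p` syntactically but depends only on `e12`, so only `e12` would need to be covered by transitivity constraints.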


6  Conclusion 

By formulating a graphical interpretation of the relational variables, we have shown that we can
generate a set of clauses expressing the transitivity constraints that exploits the sparse structure
of the relation. Adding relational variables to make the graph chordal eliminates the theoretical
possibility of there being an exponential number of clauses and also works well in practice.
A conventional SAT checker can then solve constrained satisfiability problems, although the run
times increase significantly compared to unconstrained satisfiability. Our best results were
obtained using OBDDs. By considering only the relational variables in the true support of
$F_{\text{sat}}$, we can enforce transitivity constraints with only a small increase in CPU time.